#run this line first if you have never used these packages before
#install.packages(c("tidyverse", "sf", "tmap", "readr", "here", "ggspatial", "ggrepel"))
library(tidyverse) #install the core tidyverse packages including ggplot2
library(sf) #provides tools to work with vector data
library(tmap) #for visualizing spatial data
library(readr) #functions for reading external datasets
library(here) #to better locate files not in working directory
library(ggspatial) #for finishing touches on a map
library(ggrepel) #for repeling text in ggplotWhy am I here?
Several years back, I was asked to show multiple variables on a map of the world. This is what started my R journey. I began exploring R functionality, and I stumbled through posts where authors did GIS in R. Then one day several years later, I found the amazing resource Geocomputation with R. With this I was able to explore all kinds of maps and some initial modeling. I was hooked! Would I have been better off starting with a more traditional stats focus and learning from R for Data Science? Definitely, but I wouldn’t have gotten lost in playing with data and making beautiful maps. The straightest path to a goal is one I rarely follow. So, here’s a quick introduction to making static and interactive maps in R.
Learning objectives
For this session, the learning objective is to:
- Make a map in ggplot and tmap
Important packages in rspatial
Similar to QGIS, R provides an open-source interface for GIS. The most commonly used packages to handle spatial data are sf for vectors, terra for vectors and rasters, and raster for rasters.
The most commonly used packages for visualizing spatial data are ggplot2 - R’s most famous visualization package - and tmap which is designed specifically for visualizing spatial data. This demonstration will provide code for both packages.
To get started, we need to load our packages.
Read in the data
For this exercise, the data is already saved in the main project folder so I’ll read in the csv file with city names and locations, and then I’ll read in the administrative boundaries. You can also access the city data here and the administrative boundaries here.
#It is a csv file so I use the read_csv function and provide the file path
cities <- read_csv(here::here("posts/map/Madagascar_Cities.csv")
, show_col_types = FALSE)
#Observe the first few rows of data
DT::datatable(head(cities))Now, for the administrative boundaries. Each of these are being read in using st_read() from the sf package. These are by default, spatial (sf) objects already.
#This is only the country boundary
mdg <- st_read(here::here("posts/map/shapefiles/mdg_admbnda_adm0_BNGRC_OCHA_20181031.shp"))Reading layer `mdg_admbnda_adm0_BNGRC_OCHA_20181031' from data source
`C:\Users\brian\OneDrive\Documents\website\posts\map\shapefiles\mdg_admbnda_adm0_BNGRC_OCHA_20181031.shp'
using driver `ESRI Shapefile'
Simple feature collection with 1 feature and 3 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 43.17692 ymin: -25.60575 xmax: 50.48485 ymax: -11.95139
Geodetic CRS: WGS 84
#This is the administrative level below the whole country
mdg1 <- st_read(here::here("posts/map/shapefiles/mdg_admbnda_adm1_BNGRC_OCHA_20181031.shp"))Reading layer `mdg_admbnda_adm1_BNGRC_OCHA_20181031' from data source
`C:\Users\brian\OneDrive\Documents\website\posts\map\shapefiles\mdg_admbnda_adm1_BNGRC_OCHA_20181031.shp'
using driver `ESRI Shapefile'
Simple feature collection with 22 features and 9 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 43.17692 ymin: -25.60575 xmax: 50.48485 ymax: -11.95139
Geodetic CRS: WGS 84
#This is the administrative level below the previous one
# maybe these are districts
mdg2 <- st_read(here::here("posts/map/shapefiles/mdg_admbnda_adm2_BNGRC_OCHA_20181031.shp"))Reading layer `mdg_admbnda_adm2_BNGRC_OCHA_20181031' from data source
`C:\Users\brian\OneDrive\Documents\website\posts\map\shapefiles\mdg_admbnda_adm2_BNGRC_OCHA_20181031.shp'
using driver `ESRI Shapefile'
Simple feature collection with 119 features and 13 fields
Geometry type: MULTIPOLYGON
Dimension: XY
Bounding box: xmin: 43.17692 ymin: -25.60575 xmax: 50.48485 ymax: -11.95139
Geodetic CRS: WGS 84
Convert the cities to an sf object
Remember that the cities object is a standard .csv with longitude and latitude columns, but it is not yet recognized as a spatial object. Let’s convert it to an sf object with a single geometry column and a crs.
cities_sf <- cities |>
st_as_sf(coords = c("Longitude", "Latitude")
, crs = 4326)
#observe the first few rows of data
DT::datatable(head(cities_sf))Make the map
Now the fun begins! To initiate tmap, we need to specify whether we want a static map (“plot”) or an interactive map (“view”). Then, we pass in the shape object and tell it what kind of shape we want. Then, we place the cities_sf on top of the Madagascar shape and give the dots a reasonable size and a color that stands out. Notice that just like ggplot the map is built in layers.
The second tab below shows a similar, though shorter, process for ggplot. In this case, and we can see some differences in the default settings between the two packages.
tmap_mode("plot") +
tm_shape(mdg) +
tm_polygons() + #for only the borders, use tm_borders()
tm_shape(cities_sf) +
tm_dots(size = .25, col = "red", fill = "red")
ggplot2::ggplot(mdg) +
geom_sf() +
geom_sf(data = cities_sf, color = "red") +
theme_void()
Make the map better
The map could use a title and the cities could use labels. Since we chose Madagascar the cities have long names and require us to create a bigger viewing window by adjusting the bounding box (bbox). Luckily it is not that hard. We can extract the current bbox using st_bbox() and then calculate the ranges of x and y values. We don’t need to completely max out viewing window so multiplying the bbox values by a percentage of each range does the trick. This may take some experimenting to see what works.
The x values are stored in slots 1 and 3 of the bbox and the y values in slots 2 and 4. Then, just make sure that the bbox_new object is an sf polygon and set the bbox to use it. From here, the layering process is the same, but there’s a new layer for the city labels and a title.
Notice that tmap naturally avoids text overlapping, but with ggplot we have to use ggrepel to explicitly repel the label names from each other.
#the city names are long so we have to
# make a bigger window to fit them. This isn't part of the normal process
#make an object with the current bounding box
bbox_new <- st_bbox(mdg)
#calculate the x and y ranges of the bbox
xrange <- bbox_new$xmax - bbox_new$xmin # range of x values
yrange <- bbox_new$ymax - bbox_new$ymin # range of y values
#provide the new values for the 4 corners of the bbox
bbox_new[1] <- bbox_new[1] - (0.7 * xrange) # xmin - left
bbox_new[3] <- bbox_new[3] + (0.75 * xrange) # xmax - right
bbox_new[2] <- bbox_new[2] - (0.1 * yrange) # ymin - bottom
bbox_new[4] <- bbox_new[4] + (0.1 * yrange) # ymax - top
#convert the bbox to a sf collection (sfc)
bbox_new <- bbox_new |> # take the bounding box ...
st_as_sfc() # ... and make it a sf polygon
#now plot the map
tmap_mode("plot") +
tm_shape(mdg, bbox = bbox_new) +
tm_polygons() +
tm_shape(cities_sf) +
tm_dots(size = .25, col = "red", fill = "red") +
tm_text(text = "Name", auto.placement = T, size = .6) +
tm_layout(title = "Main Cities of\nMadagascar")
#the city names are long so we have to
# make a bigger window to fit them. This isn't part of the normal process
#make an object with the current bounding box
bbox_new <- st_bbox(mdg)
#calculate the x and y ranges of the bbox
xrange <- bbox_new$xmax - bbox_new$xmin # range of x values
yrange <- bbox_new$ymax - bbox_new$ymin # range of y values
#provide the new values for the 4 corners of the bbox
bbox_new[1] <- bbox_new[1] - (0.5 * xrange) # xmin - left
bbox_new[3] <- bbox_new[3] + (0.5 * xrange) # xmax - right
bbox_new[2] <- bbox_new[2] - (0.1 * yrange) # ymin - bottom
bbox_new[4] <- bbox_new[4] + (0.1 * yrange) # ymax - top
#convert the bbox to a sf collection (sfc)
bbox_new <- bbox_new |> # take the bounding box
st_as_sfc() # ... and make it a sf polygon
ggplot2::ggplot() +
geom_sf(data = mdg) +
geom_sf(data = cities_sf, color = "red") +
ggrepel::geom_text_repel(data = cities_sf
, aes(label = Name
, geometry = geometry)
, stat = "sf_coordinates"
, min.segment.length = 0) +
coord_sf(xlim = st_coordinates(bbox_new)[c(1,2),1], # min & max of x values
ylim = st_coordinates(bbox_new)[c(2,3),2]) + # min & max of y values +
labs(title = "Main Cities of\nMadagascar") +
theme_void()
Final touches
Now that we have a map with cities plotted (we achieved our goal!), we will add a few finishing touches and set the size of the city points to the population variable in the original dataset.
Additionally, tmap provides a simple interface to go from a static map to an interative map simply by changing tmap_mode("plot") to tmap_mode("view").
#the city names are long so we have to
# make a bigger window to fit them. This isn't part of the normal process
#make an object with the current bounding box
bbox_new <- st_bbox(mdg)
#calculate the x and y ranges of the bbox
xrange <- bbox_new$xmax - bbox_new$xmin # range of x values
yrange <- bbox_new$ymax - bbox_new$ymin # range of y values
#provide the new values for the 4 corners of the bbox
bbox_new[1] <- bbox_new[1] - (0.7 * xrange) # xmin - left
bbox_new[3] <- bbox_new[3] + (0.75 * xrange) # xmax - right
bbox_new[2] <- bbox_new[2] - (0.1 * yrange) # ymin - bottom
bbox_new[4] <- bbox_new[4] + (0.1 * yrange) # ymax - top
#convert the bbox to a sf collection (sfc)
bbox_new <- bbox_new |> # take the bounding box ...
st_as_sfc() # ... and make it a sf polygon
tmap::tmap_mode("plot") +
tmap::tm_shape(mdg, bbox = bbox_new) +
tmap::tm_polygons() +
tmap::tm_shape(cities_sf) +
tmap::tm_dots(size = "Population", col = "red", fill = "red"
, legend.size.is.portrait = TRUE) +
tmap::tm_text(text = "Name", auto.placement = T
, along.lines = T, size = .6) +
#tmap::tm_scale_bar(position = c("left", "bottom"), width = 0.15) +
tmap::tm_compass(type = "4star"
, position = c("right", "bottom")
, size = 2) +
tmap::tm_layout(main.title = "Main Cities of Madagascar" , legend.outside = TRUE)
ggplot2::ggplot() +
geom_sf(data = mdg) +
geom_sf(data = cities_sf, aes(size = Population)
, color = "red") +
ggrepel::geom_text_repel(data = cities_sf
, aes(label = Name
, geometry = geometry)
, stat = "sf_coordinates"
, min.segment.length = 0) +
coord_sf(xlim = st_coordinates(bbox_new)[c(1,2),1], # min & max of x values
ylim = st_coordinates(bbox_new)[c(2,3),2]) + # min & max of y values +
ggspatial::annotation_scale(location = "bl") +
ggspatial::annotation_north_arrow(location = "br"
, which_north = "true"
, size = 1)+
labs(title = "Main Cities of Madagascar") +
theme_void()
tmap_mode("view") +
tm_shape(mdg) +
tm_borders() +
tm_shape(cities_sf) +
tm_dots(size = "Population", col = "red"
, legend.size.is.portrait = TRUE) +
tm_text(text = "Name", auto.placement = T
, along.lines = T) +
#tm_scale_bar(position = c("left", "bottom"), width = 0.15) +
tm_compass(type = "4star"
, position = c("right", "bottom")
, size = 2) +
tm_layout(main.title = "Main Cities of Madagascar" , legend.outside = TRUE)Additional Resources
This is just an initial primer. For those interested in mapping in R there are many free resources available online. A great starting point for R is the online text book, Geocomputation with R. If you would rather learn more in Python, Geocomputation with Python is a great resource.